Mon. Not. R. Astron. Soc. 000, 1-18 (2010)



Printed 12 August 2010 



(MN LaTeX style file v2.2)






Sunyaev-Zel'dovich observations of galaxy clusters out to
the virial radius with the Arcminute Microkelvin Imager*

AMI Consortium: Jonathan T. L. Zwart 1,2†, Farhan Feroz 1,
Matthew L. Davies 1, Thomas M. O. Franzen 1, Keith J. B. Grainge 1,3,
Michael P. Hobson 1, Natasha Hurley-Walker 1, Rüdiger Kneissl 1,4,
Anthony N. Lasenby 1,3, Malak Olamaie 1, Guy G. Pooley 1,
Carmen Rodríguez-Gonzálvez 1, Richard D. E. Saunders 1,3,
Anna M. M. Scaife 1,5, Paul F. Scott 1, Timothy W. Shimwell 1,
David J. Titterington 1 and Elizabeth M. Waldram 1

1 Astrophysics Group, Cavendish Laboratory, J. J. Thomson Avenue, Cambridge CB3 0HE

2 Columbia Astrophysics Laboratory, Columbia University, 550 West 120th Street, New York, NY 10027, USA

3 Kavli Institute for Cosmology Cambridge, Madingley Road, Cambridge CB3 0HA

4 Joint ALMA Office, Av El Golf 40, Piso 18, Santiago, Chile

5 Dublin Institute for Advanced Studies, 31 Fitzwilliam Place, Dublin 2, Ireland




Accepted — . Received — ; in original form 12 August 2010. 



ABSTRACT 

We present observations using the Small Array of the Arcminute Microkelvin Imager 
(AMI; 14-18 GHz) of four Abell and three MACS clusters spanning 0.171-0.686 in
redshift. We detect Sunyaev-Zel'dovich (SZ) signals in five of these without any attempt
at source subtraction, although strong source contamination is present. With
radio-source measurements from high-resolution observations, and under the assumptions
of spherical $\beta$-model, isothermality and hydrostatic equilibrium, a Bayesian analysis
of the data in the visibility plane detects extended SZ decrements in all seven clusters
over and above receiver noise, radio sources and primary CMB imprints. Bayesian
evidence ratios range from $10^{11}$:1 to $10^{43}$:1 for six of the clusters and 3000:1 for one with
substantially less data than the others. We present posterior probability distributions
for, e.g., total mass and gas fraction averaged over radii internal to which the mean
overdensity is 1000, 500 and 200, $r_{200}$ being the virial radius. Reaching $r_{200}$ involves
some extrapolation for the nearer clusters but not for the more-distant ones. We find
that our estimates of gas fraction are low (compared with most in the literature) and 
decrease with increasing radius. These results appear to be consistent with the notion 
that gas temperature in fact falls with distance (away from near the cluster centre) 
out to the virial radius. 

Key words: cosmology: observations - cosmic microwave background - galaxies: 
clusters: general - galaxies: clusters: individual (Abell 611, Abell 773, Abell 1914, Abell 
2218, MACSJ0308+26, MACSJ0717+37, MACSJ0744+39) - methods: data analysis 
- radio continuum: general 



1 INTRODUCTION 

The Sunyaev-Zel'dovich (SZ) effect (Sunyaev & Zeldovich
1970, 1972) is the inverse-Compton scattering of the CMB



† Issuing author - e-mail: jtlz2@astro.columbia.edu.
* We request that any reference to this paper cites 'AMI Consortium: Zwart et
al. 2010'.



radiation by hot, ionised gas in the gravitational potential
well of a cluster of galaxies; for reviews see Birkinshaw
(1999) and Carlstrom et al. (2002). The effect is useful in
a number of ways for the study of galaxy clusters; here we
are concerned with two in particular. First, because the SZ
effect arises from a scattering process, a cluster at one redshift
will produce the same observed SZ surface brightness as
an identical cluster at any other redshift, so that the usual
sensitivity issue of high-redshift observing does not arise.
Second, since the SZ surface brightness is proportional to
the line-of-sight integral of pressure through the cluster, the
SZ signal is less sensitive to concentration than the X-ray
bremsstrahlung signal; one corollary of this is that the ratio
SZ-sensitivity / X-ray-sensitivity increases with distance
from the cluster centre so that with SZ one can probe out to,
say, the virial radius, provided the SZ telescope is sensitive
to sufficiently large angular scales.

SZ decrements are faint, however, and can be contaminated
or obliterated by other sources of radio emission.
A range of new, sensitive instruments has been
brought into use to capitalise on the science from SZ observations.
Among these instruments, which employ different
strategies to maximise sensitivity and minimise
confusion, are ACT (Swetz et al. 2010; Menanteau et al.
2010), AMI (AMI Consortium: Zwart et al. 2008), AMiBA
(Ho et al. 2009; Wu et al. 2009), APEX (Dobbs et al. 2006),
CARMA (www.mmarray.org), SPT (Carlstrom et al. 2009;
Andersson et al. 2010) and SZA (Culverhouse et al. 2010).

In the case of AMI, two separate interferometer arrays are
used, the Small Array (SA) having short baselines sensitive
to SZ and radio sources, and the Large Array (LA) with
baselines sensitive to the radio sources alone and thus providing
source subtraction for the SA. Key parameters of the
SA and LA are shown in Table 1.



The SA was built first. Partly to test it while
the LA was being completed, we used the SA to
observe Galactic supernova remnants and likely regions
of spinning dust (AMI Consortium: Scaife et al.
2008, 2009a,b; AMI Consortium: Hurley-Walker et al.
2009a,b) bright enough not to need source subtraction.
But we also wanted to begin SZ observation, test our
algorithms to extract SZ signals in the presence of radio
sources, CMB primary anisotropies and receiver noise, and
begin our SZ science programme. To do this required the
use of long-baseline data from the 15-GHz Ryle Telescope
(RT; see e.g. Grainge et al. 1993; Grainge et al. 1996;
Grainge et al. 2002a,b; Cotter et al. 2002a,b; Grainger et al.
2002; Saunders et al. 2003; Jones et al. 2005) taken in the
past; this needs caution because of radio source variability
(see Bolton et al. 2006, Sadler et al. 2006 and
AMI Consortium: Franzen et al. 2009), but our data-analysis
algorithm allows for variability and in fact we were
able to use some data from the LA, which, at the time,
was only partially commissioned. Here we present the first
part of this work, SZ measurements of seven known clusters
spanning ranges of redshift $z$ and of X-ray luminosity $L_X$.

We assume a concordance $\Lambda$CDM cosmology, with
$\Omega_m = 0.3$, $\Omega_\Lambda = 0.7$ and $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$.
However, in plots of probability distribution, we explicitly
include the dimensionless Hubble parameter, defined as
$h = H_0/(100$ km s$^{-1}$ Mpc$^{-1})$, to allow comparison with
other work. All coordinates are J2000 epoch. Our convention
for spectral index $\alpha$ is $S_\nu \propto \nu^{-\alpha}$ where $S$ is flux density
and $\nu$ is frequency. We write the radius internal to which
the average density is $a$ times the critical density $\rho_{crit}$ at
the particular redshift as $r_a$, the total mass (gas plus dark
matter) internal to $r_a$ as $M_a$, and the gas mass internal to
$r_a$ as $M_{gas,a}$.
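The spectral-index and Hubble-parameter conventions above can be illustrated with a minimal Python sketch. The function names and the example source (5 mJy at 15 GHz with $\alpha = 0.7$) are hypothetical and chosen only for illustration; they are not taken from the source tables.

```python
def extrapolate_flux(s_ref_mjy, nu_ref_ghz, nu_ghz, alpha):
    """Flux density at nu_ghz for a power law S_nu proportional to nu^(-alpha).

    With this sign convention a positive alpha means the flux
    density falls with increasing frequency.
    """
    return s_ref_mjy * (nu_ghz / nu_ref_ghz) ** (-alpha)


def dimensionless_hubble(h0_km_s_mpc=70.0):
    """h = H0 / (100 km/s/Mpc); the default H0 is the value assumed in the text."""
    return h0_km_s_mpc / 100.0


# Hypothetical source: 5 mJy at 15 GHz with alpha = 0.7, extrapolated to 22 GHz.
s_22 = extrapolate_flux(5.0, 15.0, 22.0, 0.7)
print(f"S(22 GHz) = {s_22:.2f} mJy, h = {dimensionless_hubble():.2f}")
```

A steep-spectrum source (positive $\alpha$) is therefore fainter at 22 GHz than at 15 GHz under this convention.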



Table 1. AMI (AMI Consortium: Zwart et al. 2008) technical
summary.

                             SA                  LA
Antenna diameter             3.7 m               12.8 m
Number of antennas           10                  8
Baseline lengths (current)   5-20 m              18-110 m
Primary beam (15.7 GHz)      20.1'               5.5'
Synthesized beam             ≈ 3'                ≈ 30"
Flux sensitivity             30 mJy s^(-1/2)     3 mJy s^(-1/2)
Observing frequency          13.9-18.2 GHz (both arrays)
Bandwidth                    4.3 GHz (both arrays)
Number of channels           6 (both arrays)
Channel bandwidth            0.72 GHz (both arrays)



2 CLUSTER SELECTION AND RT 
OBSERVATION 

We used the NOrthern ROSAT All-Sky Survey (NORAS;
Böhringer et al. 2000) catalogue as a source of low-redshift
($z < 0.3$) clusters, and the MAssive Cluster Survey (MACS;
Ebeling et al. 2001; Ebeling et al. 2007; Ebeling et al. 2010)
to give secure, more-distant clusters that provide some
filling-out of the $L_X$-$z$ plane. We restricted redshifts to
$z > 0.1$ to avoid resolving out SZ signals, and luminosity
to $L_X > 7 \times 10^{37}$ W (0.1-2.4 keV, rest frame).

We restricted declinations to greater than 20° since the
RT had only East-West baselines, and further excluded clusters
which we knew, from the NVSS (Condon et al. 1998) or
from archival RT data, would be too contaminated by radio
sources. Details of the resulting seven clusters in this work
are given in Table 2. Source surveying of the remaining clusters
with the compact array of the RT - note that this array
contained five of the eight antennas of the LA - was then
carried out as follows.

The RT data were obtained between 2004 and 2006.
Each cluster field was surveyed in two ways: with a wide
shallow raster and a deep central one. The wide shallow
raster comprised a hexagonal close-packed raster of 11 × 12
pointings on a 5' grid, with a dwell time at each pointing
of eight minutes; the aim was to identify relatively bright
radio sources in the direction of an SA pointing. The centre
of each cluster was followed up with a hexagon of 7 × 12-hour
RT pointings, on a 5' grid, in order to detect faint sources
near the target cluster.

Data were reduced, and point-source positions and
fluxes extracted, using procedures developed for the 9C survey
and outlined in Waldram et al. (2003). The source data
are given in Table 3.



3 AMI OBSERVATION AND REDUCTION 

The seven clusters were observed with the SA between 
2007 October and 2008 January. Each cluster typically 
had 25 hours of SA observing on the sky (though A2218, 
MACSJ0308+26 and MACSJ0717+37 had some 70 hours).
The uv-coverage is well-filled (Figure 1) all the way down to
≈180λ, corresponding to a maximum angular scale of ≈10'.
This is a significantly greater angular scale than is achiev- 
able with OVRO/BIMA, the RT, or the SZA. 

Calibration and reduction procedures were as follows. 
One of our two absolute flux calibrators, 3C286 and 3C48, 
Table 2. Clusters in this work. Temperatures, redshifts and X-ray luminosities are from (1) LaRoque et al. (2006), (2) Balestra et al. (2007),
(3) Bonamente et al. (2008), (4) Ebeling et al. (2007), (5) Böhringer et al. (2000), (6) Struble & Rood (1999), (7) Ebeling (Priv. Comm.). The map
noise indicated is for a SA naturally-weighted map with all baselines and no source subtraction. The integration times t_int are on-sky
times, and do not account for variations in system temperature with airmass or poor weather, or for the amount of data flagged due to,
for example, shadowing.

Cluster        RA (J2000)    Dec (J2000)       z          T_e/keV            L_X/10^37 W  t_int/hours  rms/μJy
A611           08 00 59.40   +36 03 01.0       0.288 (5)  6.79 +…/-… (1)     8.63 (5)     23.8         140
A773           09 17 52.97   +51 43 55.5       0.217 (6)  … (1)              12.11 (5)    23.8         160
A1914          14 26 02.15   +37 50 05.8       0.171 (5)  … (1)              15.91 (5)    20.9         110
A2218          16 35 52.80   +66 12 50.0       0.171 (5)  7.80 +…/-… (1)     8.16 (5)     62.4         90
MACSJ0308+26   03 08 55.40   +26 45 39.0 (7)   0.352 (7)  11.2 +…/-… (2)     15.89 (7)    86.6         140
MACSJ0717+37   07 17 30.00   +37 45 00.0 (7)   0.545 (4)  11.6 +…/-… (4)     25.33 (7)    23.8         160
MACSJ0744+39   07 44 48.00   +39 27 00.0 (7)   0.686 (4)  8.14 +…/-… (1)     17.16 (7)    71.8         320



Table 3. Contaminating sources. W denotes RT wide, shallow raster (11 × 12 pointings), while H denotes a RT deep hexagon (7 pointings).
Fluxes from RT shallow raster observations were boosted by 10 per cent to account for pointing errors (Waldram et al. 2003). 9C denotes
data from 9C pointed observations (Waldram et al. 2003), with the flux error estimated at < 5 per cent.

Cluster        No.   RA (J2000)         Dec (J2000)        Array    Mode   S/mJy
A611           1     08 00 43.28        +36 14 00.9        SA              5.5 ± 1.7
               2     08 00 09.91        +36 04 15.4        SA              4.4 ± 1.3
A773           1     09 18 38.29        +51 50 25.0        SA              4.4 ± 0.4
               2     09 17 06.13        +51 44 54.9        SA              3.4 ± 0.3
               3     09 17 57.02        +51 45 08.0        LA              0.12 ± 0.01
               4     09 18 01.33        +51 44 13.1        LA              0.32 ± 0.03
               5     09 17 45.31        +51 43 04.6        LA              0.22 ± 0.02
               6     09 17 55.58        +51 43 01.1        LA              0.19 ± 0.02
               7     09 17 50.67        +51 41 06.1        LA              0.31 ± 0.03
A1914          1     14 25 10.21 (SA)   +37 52 35.1 (SA)   SA/LA           4.2 ± 0.4 (LA)
               2     14 27 24.75 (RT)   +37 46 33.8 (RT)   RT/LA           9.7 ± 1.0 (LA)
               3     14 25 48.02        +37 47 50.3        LA              1.0 ± 0.3
               4     14 25 40.84        +37 45 50.4        LA              3.7 ± 0.4
               5     14 25 50.53        +37 45 10.3        LA              0.61 ± 0.18
               6     14 25 58.53        +37 44 00.1        LA              0.60 ± 0.18
               (7)   14 25 50.53        +37 45 10.3        SA              4.3 ± 1.3
A2218          1     16 35 47.24        +66 14 46.9        RT       H      1.9 ± 0.6
               2     16 36 15.74        +66 14 27.0        RT       H      1.9 ± 0.6
               3     16 35 22.14        +66 13 20.6        RT       W      5.6 ± 1.7
               4     16 33 18.18        +66 00 50.6        RT       W      10 ± 3
               5     16 35 39.78        +65 58 12.0        RT       W      11 ± 3
               6     16 34 46.36        +65 55 18.6        RT       W      13 ± 4
               7     16 37 22.56        +66 21 18.4        SA(L)           5.2 ± 1.6
MACSJ0308+26   1     03 09 42.02        +26 56 30.3        9C       W      8 ± 2
               2     03 08 56.52        +26 44 54.0        SA(L)           2.4 ± 0.7
               3     03 09 40.14        +26 37 23.6        SA(L)           2.9 ± 0.9
MACSJ0717+37   1     07 17 36.09        +37 45 56.3        RT       H      2.1 ± 0.3
               2     07 17 35.91        +37 45 11.2        RT       H      1.8 ± 0.5
               3     07 17 37.14        +37 44 23.1        RT       H      3.9 ± 1.2
               4     07 17 41.06        +37 43 15.2        RT       H      2.5 ± 0.8
               5     07 18 10.51        +37 49 14.6        SA(L)           18 ± 6
               6     07 16 35.69        +37 39 14.2        SA(L)           4.7 ± 1.4
MACSJ0744+39   1     07 44 32.95        +39 32 15.0        RT       H      2.8 ± 0.2
               2     07 44 22.30        +39 25 46.5        RT       H      1.1 ± 0.2
               3     07 43 58.76        +39 15 02.3        RT       W      52.0 ± 1.7
               4     07 43 45.99        +39 14 21.5        RT       W      8.3 ± 1.7



was observed immediately before or after each cluster observation.
The absolute flux calibration is accurate to 5 per cent
(see AMI Consortium: Hurley-Walker et al. 2009b). Each
cluster observation was reduced separately using our in-house
software REDUCE. An automatic reduction pipeline
is in place, but all the data were examined by eye for problems.
Data were flagged for shadowing, slow fringe rates,
path-compensator delay errors and pointing errors. The data
were flux-calibrated, Fourier transformed and fringe-rotated
to the pointing centre. Further amplitude cuts were made in
order to remove interference spikes and discrepant baselines.
The amplitudes of the visibilities were corrected for variations
in the system temperature with airmass, cloud and
weather, and the data weights converted into Jy$^{-2}$.
Secondary (interleaved) calibration was applied, by observing
a point-source calibrator every hour, to correct for system
phase drifts. The data were smoothed from one-second to
10-second samples, and calibrated uvfits files were output
and co-added using pyfits. Typically 20-30 per cent of the
data were discarded due to bad weather, telescope downtime
and other flagging. The data were mapped in AIPS and also
directly analysed in the visibility plane.
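The smoothing from one-second to 10-second samples, with weights expressed in Jy$^{-2}$, can be sketched as an inverse-variance weighted average. This is only an illustration of the idea, not the in-house reduction software: the function name, the per-sample noise inputs and the toy data are all assumptions.

```python
def average_visibilities(vis, sigma_jy, block=10):
    """Inverse-variance weighted average over consecutive blocks of samples.

    vis      : list of complex visibilities (Jy), one per 1-s sample
    sigma_jy : list of per-sample rms noise values (Jy)
    block    : number of input samples per output sample
    Returns (averaged visibilities, combined weights in Jy^-2).
    """
    out_vis, out_weights = [], []
    for start in range(0, len(vis), block):
        chunk = vis[start:start + block]
        # Each weight is 1/sigma^2, so its unit is Jy^-2.
        weights = [1.0 / s ** 2 for s in sigma_jy[start:start + block]]
        total = sum(weights)
        out_vis.append(sum(w * v for w, v in zip(weights, chunk)) / total)
        out_weights.append(total)  # weights of independent samples add
    return out_vis, out_weights


# Toy data: 20 identical 1-s samples with 0.1 Jy noise each.
avg, wts = average_visibilities([complex(1.0, 0.5)] * 20, [0.1] * 20)
print(avg, wts)
```

Because the weights of independent samples add, each 10-second sample carries ten times the weight of a single one-second sample when the noise is uniform.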

In some cases, as indicated in Table 3, it was possible to
use some of the then partially commissioned LA for source
subtraction, assisting with any effects of the time gap between
RT and SA observations (LA calibration and reduction
are very similar to that of the SA, described above).
Similarly, for some sources of high flux density away from
the cluster, the long baselines of the SA provided useful
measurements.



3.1 Maps 

We used standard AIPS tasks to produce naturally weighted
SA maps with all baselines, no taper and no source subtraction.
These images, after CLEANing, are shown in Figures
2 and 3. The maps have differing noises due largely to differing
integration times. Sources are evident in all the maps.
In five of the maps, an extended SZ decrement is visible, despite
major source contamination at the X-ray centres in the
cases of A2218 and MACSJ0308+26. In MACSJ0717+37,
there seems to be some negative signal but the source contamination
at the map centre is severe (Edge et al. 2003;
Ebeling et al. 2004). In MACSJ0744+39, the contamination
is less but there is still only a weak decrement - but we note
that the thermal noise is at least twice that of every other
map.

Subsequent analysis was carried out in the visibility 
plane, taking into account radio sources, receiver noise and 
primary CMB contamination, as we describe in the next 
section. 



4 RÉSUMÉ OF ANALYSIS
4.1 Bayesian analysis 

Bayesian analysis of interferometer observations of clusters
in SZ has been discussed by us previously in e.g.
Hobson & Maisinger (2002), Marshall et al. (2003) and
Feroz et al. (2009). The advantages of this approach are as
follows.



• One infers the quantity that one actually wants, the
probability distribution of the values of parameters $\Theta$, given
the data $D$ and some model, or hypothesis, $H$, via Bayes'
theorem:

$$\Pr(\Theta|D,H) = \frac{\Pr(D|\Theta,H)\,\Pr(\Theta|H)}{\Pr(D|H)}. \qquad (1)$$



• The likelihood $\Pr(D|\Theta,H)$ is the probability of the
data given parameter values and a model, and encodes the
constraints imposed by the observations. It includes information
about noise arising from the receivers, primary CMB
and unsubtracted radio sources lying below the detection
level of the source-subtraction procedure.

• The prior $\Pr(\Theta|H)$ allows one to incorporate prior
knowledge of the parameter values and, for example, allows
one to deal fully and objectively with the contaminants such
as sources (which may be variable).

• The evidence $\Pr(D|H)$ is obtained by integrating
$\Pr(D|\Theta,H)\,\Pr(\Theta|H)$ over all $\Theta$, allowing normalization of
the posterior $\Pr(\Theta|D,H)$. One can select different models
by comparing their evidences, the process automatically
incorporating Occam's razor.

• However, performing these integrations, and sampling
the parameter space, is non-trivial and can be slow. The
use of the 'nested sampler' algorithm MultiNest both
speeds up the sampling process significantly and, more importantly,
allows one to sample from probability distributions
with multiple peaks and/or large curving degeneracies
(Feroz & Hobson 2008).

• Throughout the whole analysis, probability distribu- 
tions - with their asymmetries, skirts, multiple peaks and 
whatever else - are used and combined correctly, rather than 
discarding information (and, in general, introducing bias) 
by representing distributions by a mean value and an uncer- 
tainty expressed only in terms of a covariance matrix. 
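The roles of likelihood, prior, evidence and posterior listed above can be illustrated with a toy one-parameter problem evaluated on a grid. This is only a sketch: the Gaussian likelihood, the uniform prior range and the three data values are assumptions made for illustration, and the brute-force grid integration stands in for, and is far simpler than, the nested sampling used in the analysis.

```python
import math


def gaussian_loglike(theta, data, sigma):
    """Log-likelihood of independent Gaussian data with common mean theta."""
    norm = math.log(sigma * math.sqrt(2.0 * math.pi))
    return sum(-0.5 * ((d - theta) / sigma) ** 2 - norm for d in data)


def grid_posterior(data, sigma, lo, hi, n=2001):
    """Return (grid, posterior density, evidence) for a uniform prior on [lo, hi]."""
    step = (hi - lo) / (n - 1)
    grid = [lo + i * step for i in range(n)]
    prior = 1.0 / (hi - lo)
    joint = [math.exp(gaussian_loglike(t, data, sigma)) * prior for t in grid]
    # Evidence: trapezoidal integral of likelihood times prior over theta.
    evidence = step * (sum(joint) - 0.5 * (joint[0] + joint[-1]))
    posterior = [j / evidence for j in joint]
    return grid, posterior, evidence


# Toy data (hypothetical): three noisy measurements of one amplitude.
data, sigma = [0.9, 1.1, 1.0], 0.2
grid, posterior, evidence_signal = grid_posterior(data, sigma, -5.0, 5.0)

# A model with no free parameter (amplitude fixed at zero) has an
# evidence equal to its likelihood.
evidence_null = math.exp(gaussian_loglike(0.0, data, sigma))
delta_log_evidence = math.log(evidence_signal / evidence_null)
print(f"log-evidence difference = {delta_log_evidence:.2f}")
```

The wide uniform prior lowers the evidence of the free-amplitude model relative to its best-fit likelihood, which is the Occam's-razor penalty mentioned in the list above.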



4.2 Physical Model and Assumptions 

We restrict ourselves to the simplest model, by assuming
a spherical $\beta$-model for isothermal (see Section 4.3),
ideal cluster gas in hydrostatic equilibrium. Following e.g.
Grego et al. (2001), the equation of hydrostatic equilibrium
for a spherical shell of gas of density $\rho$ at pressure $p$, a radius
$r$ from the cluster centre is

$$\frac{dp(r)}{dr} = -\frac{G M_r \rho(r)}{r^2} \qquad (2)$$

where $M_r = M(<r)$ is the total mass (gas plus dark matter)
internal to radius $r$ and the gas' density distribution
$\rho(r)$ is

$$\rho(r) = \frac{\rho(r=0)}{\left[1 + (r/r_c)^2\right]^{3\beta/2}}. \qquad (3)$$



The density profile has a flat top at low $r/r_c$ (with $r_c$ the core
radius), then turns over, and at large $r/r_c$ has a logarithmic
slope of $-3\beta$. The profile may be integrated to find the gas
mass $M_{gas}$ within $r$.
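The behaviour of the $\beta$-model profile described above (flat core, logarithmic slope tending to $-3\beta$, and a gas mass obtained by integration) can be checked numerically. The sketch below works in arbitrary units; the function names, the midpoint-rule integration and the parameter values are assumptions chosen for illustration.

```python
import math


def beta_density(r, rho0, r_c, beta):
    """Beta-model gas density, equation (3)."""
    return rho0 * (1.0 + (r / r_c) ** 2) ** (-1.5 * beta)


def log_slope(r, r_c, beta, eps=1e-4):
    """Numerical d ln(rho) / d ln(r) by a central difference in ln(r)."""
    upper = math.log(beta_density(r * (1.0 + eps), 1.0, r_c, beta))
    lower = math.log(beta_density(r * (1.0 - eps), 1.0, r_c, beta))
    return (upper - lower) / (math.log(1.0 + eps) - math.log(1.0 - eps))


def gas_mass(r_max, rho0, r_c, beta, n=20000):
    """Integrate 4 pi r^2 rho(r) from 0 to r_max with the midpoint rule."""
    dr = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        total += 4.0 * math.pi * r * r * beta_density(r, rho0, r_c, beta) * dr
    return total


# Arbitrary units with r_c = 1 and beta = 0.7 (hypothetical values).
print(log_slope(1.0e-3, 1.0, 0.7))  # close to 0 inside the core
print(log_slope(1.0e4, 1.0, 0.7))   # close to -3 * beta far outside the core
print(gas_mass(5.0, 1.0, 1.0, 0.7))
```

For $\beta = 2/3$ the integral has the closed form $4\pi\rho(0)\,r_c^3\,[x - \arctan x]$ with $x = r/r_c$, which gives a convenient check on the numerical integration.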



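One also requires the equation of state of the gas,
i.e. $p(\rho)$. For ideal gas, $p = \frac{\rho}{\mu} k_B T$, with $\mu$ the effective mass
per gas particle (we take $\mu = 0.6\,m_p$); equation (2) becomes

$$\frac{d}{dr}\left(\frac{\rho k_B T}{\mu}\right) = -\frac{G M_r \rho}{r^2} \qquad (4)$$

and one obtains

$$M_r = -\frac{k_B T}{\mu G}\,\frac{r^2}{\rho}\,\frac{d\rho}{dr} = \frac{3\beta r^3}{r_c^2 + r^2}\,\frac{k_B T}{\mu G}. \qquad (5)$$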
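Equation (5) can be evaluated directly once units are fixed. The following sketch assumes SI constants rounded to four significant figures and a hypothetical cluster ($T = 8$ keV, $\beta = 0.7$, $r_c = 0.2$ Mpc); neither the function name nor these values come from the analysis itself.

```python
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
M_P = 1.6726e-27    # proton mass, kg
KEV = 1.602e-16     # joules per keV
MPC = 3.0857e22     # metres per megaparsec
M_SUN = 1.989e30    # solar mass, kg
MU = 0.6 * M_P      # effective mass per gas particle, as in the text


def hydrostatic_mass(r_mpc, t_kev, beta, r_c_mpc):
    """Total mass (solar masses) inside radius r for an isothermal beta-model.

    Implements M_r = 3 beta r^3 k_B T / [mu G (r_c^2 + r^2)], equation (5),
    with k_B T supplied directly as an energy in keV.
    """
    r = r_mpc * MPC
    r_c = r_c_mpc * MPC
    mass_kg = 3.0 * beta * r ** 3 * t_kev * KEV / (MU * G * (r_c ** 2 + r ** 2))
    return mass_kg / M_SUN


# Hypothetical cluster: T = 8 keV, beta = 0.7, r_c = 0.2 Mpc.
for radius in (0.5, 1.0, 2.0):
    mass = hydrostatic_mass(radius, 8.0, 0.7, 0.2)
    print(f"M(<{radius} Mpc) = {mass:.2e} M_sun")
```

Well outside the core the mass grows linearly with radius, the usual consequence of assuming isothermality.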



4.3 Priors used here 



The forms of the priors we have assumed for cluster and
source parameters are given in Table 4. Positions $x_c$, redshifts
$z$ and gas temperatures $T_e$ for individual clusters are
quoted in Table 2. For the sources, positions and fluxes
$S_i$ are in Table 3 and $\alpha_i$ is the 15-22 GHz probability kernel
for source spectral index. Note that for radio sources, we use
$\delta$-functions on source positions since the position error of a
source is much smaller than an SA synthesized beam, while
for source fluxes, we use a Gaussian centred on the flux density
from high-resolution observations with a 1-$\sigma$ width of
± 30 per cent to allow for variability, but for A773 we later
tighten the prior on source flux (see Feroz et al. 2009 for details).
We next comment on our use of a single temperature
for each cluster.

Figure 1. SA uv-coverage for A2218; coverages for the other clusters are very similar to this. The different colours correspond to different
frequency channels.

Most SZ work so far has concentrated on the inner parts
of clusters, but as one moves to radii larger than, say, $r_{2500}$
the observational position on $T_e(r)$ seems to be unclear. The
following examples from the literature attempt to measure
$T_e(r)$ out to about half the classical virial radius, i.e. half
of $r_{180}$ (Peebles 1993), in samples of clusters. In 30 clusters
observed with ASCA, Markevitch et al. (1998) find that on
average $T_e$ drops to about 0.6 of its central value by $0.5\,r_{180}$.
Using ROSAT observations of 26 clusters, Irwin et al. (1999)
rule out a temperature drop of 20 per cent at 10 keV within
$0.35\,r_{180}$ at 99 per cent confidence. With BeppoSAX observations
of 21 clusters, De Grandi & Molendi (2002) find
that on average $T_e$ falls to about 0.7 of its central value
by $0.5\,r_{180}$. With Chandra observations of 13 relaxed clusters,
Vikhlinin et al. (2005) find that on average $T_e$ falls by
about 40 per cent between $0.15\,r_{180}$ and $0.5\,r_{180}$ but with
near-flat exceptions. In XMM-Newton observations of 48
clusters, Leccardi & Molendi (2008) find that most have $T_e$
falling by 20-40 per cent from $0.15\,r_{180}$ to $0.4\,r_{180}$ but that a
minority are flat. Using XMM-Newton data on 37 clusters,
Zhang et al. (2008) find that $T_e(r)$ is broadly flat between
$0.02\,r_{500}$ and $r_{500}$.

We have tried to find measurements in the literature of
$T_e(r)$ out to large $r$ for our seven clusters, with the following
results. Using Chandra data on A611, Donnarumma et al.
(2010) find that $T_e$ peaks at 200 kpc and falls to 80 per cent
of the peak at 600 kpc. We could not find a radial profile
for A773, but Govoni et al. (2004) show a temperature map
from Chandra out to 400 kpc radius; assessing this purely
by eye, we estimate that the mean $T_e$ is about 8 keV with
hotter and colder patches but no clear radial trend. For
A1914, Zhang et al. (2008) find from XMM-Newton data
that $T_e(r)$ is flat from 150 to 900 kpc, while on the other
hand Mroczkowski et al. (2009) find from Chandra data that
$T_e(r)$ falls from 9 keV at 0.2 Mpc to 6.6 keV at 1.2 Mpc.
For A2218, Pratt et al. (2005) find from XMM-Newton data
that $T_e(r)$ falls from 8 keV near the centre to 6.6 keV at
700 kpc. Unsurprisingly, we have been unable to find $T_e(r)$
estimates for our MACS clusters, which are distant.

X-ray analysis at large $r$ is of course hampered by uncertainty
in the background. The satellite Suzaku has a low
orbit which results in some particle screening by the Earth's
magnetic field and thus a low background. George et al.
(2009) find that in cluster PKS0745-191, $T_e(r)$ falls by
roughly 70 per cent from $0.3\,r_{200}$ to $r_{200}$ with no extrapolation
of the data in $r$ and indeed going beyond $r_{200}$, and
Bautz et al. (2009) and Hoshino et al. (2010) find somewhat
similar behaviour in respectively A1795 and A1413. As far
as we know, these are as yet the only relevant X-ray observations
that extend to very large $r$.

In view of the foregoing, we chose to assume isothermality
(at the temperatures given in Table 2), and to examine
the consequences in this case.
Table 4. Fitted parameter names and priors for the cluster analysis. The 15-22 GHz probability kernel for source spectra is α_i.

Cluster:
  x_c          Gaussian, σ = 1.0'
  z            δ-function
  r_c          Uniform, 10-1000 kpc h^-1
  β            Uniform, 0.3-1.5
  T_e          Gaussian, value from literature ±15%
  M_gas,200    Uniform in log-space, (0.01-5.00) × 10^14 M_⊙ h^-2
Radio sources:
  Position     δ-function
  S_i          Gaussian, ±30 per cent
  α_i          Smoothed version of that in Waldram et al. (2007)
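Nested samplers commonly draw parameters by mapping uniform random numbers in (0, 1) onto each prior. The priors of Table 4 could be expressed in that form as in the sketch below. This is an illustration only: the function names are hypothetical, the ±15 and ±30 per cent widths are treated as 1-σ values (an assumption for the temperature prior), and the spectral-index kernel, which is tabulated elsewhere, is omitted.

```python
from statistics import NormalDist


def uniform_prior(u, lo, hi):
    """Map u in [0, 1] to a uniform distribution on [lo, hi]."""
    return lo + u * (hi - lo)


def log_uniform_prior(u, lo, hi):
    """Map u in [0, 1] to a distribution uniform in log-space on [lo, hi]."""
    return lo * (hi / lo) ** u


def gaussian_prior(u, mean, sigma):
    """Map u in (0, 1) to a Gaussian via the inverse CDF (u must not be 0 or 1)."""
    return NormalDist(mean, sigma).inv_cdf(u)


def cluster_prior_transform(cube, t_lit_kev):
    """Map four numbers in (0, 1) to cluster parameters following Table 4."""
    u_rc, u_beta, u_t, u_mgas = cube
    return {
        "r_c": uniform_prior(u_rc, 10.0, 1000.0),             # kpc h^-1
        "beta": uniform_prior(u_beta, 0.3, 1.5),
        "T_e": gaussian_prior(u_t, t_lit_kev, 0.15 * t_lit_kev),  # keV
        "M_gas_200": log_uniform_prior(u_mgas, 0.01, 5.00),   # 1e14 M_sun h^-2
    }


def source_flux_prior(u, s_mjy, fractional_sigma=0.30):
    """Gaussian prior on a source flux centred on its measured value."""
    return gaussian_prior(u, s_mjy, fractional_sigma * s_mjy)


# The midpoint of the unit cube maps to the median of each prior.
print(cluster_prior_transform((0.5, 0.5, 0.5, 0.5), t_lit_kev=7.8))
print(source_flux_prior(0.5, 5.5))
```

Fixed quantities with δ-function priors, such as redshift and source position, need no transform and are simply held at their catalogued values.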
(a) A611. The 1-σ map noise is 139 μJy. Contour levels
start at ±280 μJy and increase at each level by a factor
of √2.

(b) A1914. The 1-σ map noise is 144 μJy. Contour levels
start at ±290 μJy and increase at each level by a
factor of √2.

(c) A773. The 1-σ map noise is 157 μJy. Contour levels
start at ±310 μJy and increase at each level by a factor
of √2.

(d) A2218. The 1-σ map noise is 88 μJy. Contour levels
start at ±180 μJy and increase at each level by a factor
of √2.

Figure 2. SA naturally-weighted maps of the Abell clusters. No source subtraction has been done. The synthesized beam is indicated
in the lower left corner of each image.
(a) MACSJ0308+26. The 1-σ map noise is 141 μJy.
Contour levels start at ±280 μJy and increase at each
level by a factor of √2.

(b) MACSJ0717+37. The 1-σ map noise is 161 μJy.
Contour levels start at ±320 μJy and increase at each
level by a factor of √2.

(c) MACSJ0744+39. The 1-σ map noise is 317 μJy.
Contour levels start at ±630 μJy and increase at each
level by a factor of √2.

Figure 3. SA naturally-weighted maps of the MACS clusters. No source subtraction was undertaken for these images. The synthesized
beam is indicated in the lower left corner of each image.



5 EVIDENCES 

We consider two basic models, as follows. The first model
consists of hypothesis $H_1$ that the data support thermal
and CMB noise plus a number of contaminating radio
sources, together with priors on source parameters. The second
model consists of hypothesis $H_2$ that the data support
the two noise contributions plus the contaminating sources
and also a cluster in the SZ with a $\beta$-profile, plus priors
on the fitted parameters. We have carried out the analysis
in two stages: first, determining the best modelling of the
source contributions in each cluster field; and second
determining in each field the extent, if any, to which $H_2$ is
supported over $H_1$.



5.1 Source model selection 

Inside each of $H_1$ and $H_2$, we can consider different models
for the field of contaminating sources. We now discuss the
use of the Bayesian evidence for model selection in the two
cases (A773 and A1914) for which source observations had
suggested a possible choice of source model.






5.1.1 A773 

The models for A773 all include seven point sources: none
was detected with the RT, two were found in the SA data
and five were found with subsequent LA observations (see
Table 3). We compared two models: one in which the flux
uncertainties were ±30 per cent, to allow for variability, and
another in which the flux uncertainties were reduced to ±10
per cent. We carried out a Bayesian analysis run for the
first model and another for the second. The difference in the
log$_e$-evidence was 1.20 ± 0.11, marginally favouring the
10-per cent model; that is, the odds in favour of the 10-per cent
model over the 30-per cent model are 3.3 ± 1.1 to 1. There is
thus little to choose between the models. For A773 we have
used the 10-per cent model but kept the 30-per cent model
for the other clusters.

5.1.2 A1914 

For A1914, we consider three source models, all of which
have one source from the SA long baselines and four sources
detected with the LA. In one of the models (A) we include
an RT-detected source; in a second (B), the flux for that
source is taken from the LA data (which were taken much
closer in time to the SA observations), and the errors are
tightened; in the third model (C), a further source (source
7) that is possibly detected by the SA is also included. The
relative log$_e$-evidences for each model with respect to model
C and given $H_2$ are shown in Table 5.

Model C, which includes the source candidate possibly
detected by the SA, is overwhelmingly disfavoured relative
to the two models (A and B) that have only six sources, and
we discard model C.

Of the two models with six sources, model B, in which
the point-source flux errors are tightened, is favoured (relative
to model A) by an odds ratio of $e^{4.49 \pm 0.16}$. Consequently
we select model B as the preferred model for parameter
estimation. Once again we see that the Bayesian evidence is a
useful and straightforward tool for model selection in cases
where we want to test for source detection and errors on
prior fluxes.

5.2 Cluster Detections 

For each cluster, the log$_e$-evidence difference $\Delta Z$ for $H_2$ over
$H_1$, that is, the log$_e$-evidence for an SZ signal over and
above (thermal noise plus CMB primary anisotropies plus
the sources), is shown in Table 6.
Thus the evidence ratios, given by $E = \exp \Delta Z$, are huge
(ranging from $10^{11}$ to $10^{43}$) except for MACSJ0744+39. For
this cluster, $E$ is about 3000, i.e. there is a 1 in 3000 chance
that the SZ detection is spurious; note that this is the cluster
for which the thermal noise is at least twice that of any
of the others. Of course, we know from optical and/or X-ray
that a cluster is present in each case. Thus the high
$E$-values indicate the power of the observing plus analysis
methodology for detecting SZ even in the presence of serious
source confusion. The methodology works even with substantial
uncertainty on the source fluxes but requires that
the existences of the sources, in approximately the right
positions, are correctly determined.

Table 5. Relative evidences for different source models for A1914.

Model   Sources   Relative log_e-evidence
A       6         5.56 ± 0.19
B       6         10.05 ± 0.17
C       7         0.0
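The conversion between a log$_e$-evidence difference and an evidence ratio, $E = \exp \Delta Z$, used in Sections 5.1 and 5.2 is a one-line calculation. The sketch below applies it to three values quoted in this paper (1.20 for the A773 source models, 7.88 for MACSJ0744+39 and 27.13 for A773); the function names are hypothetical.

```python
import math


def evidence_ratio(delta_log_evidence):
    """Evidence ratio E = exp(delta Z) for a natural-log evidence difference."""
    return math.exp(delta_log_evidence)


def log10_evidence_ratio(delta_log_evidence):
    """Base-10 exponent of the evidence ratio, useful when E is very large."""
    return delta_log_evidence / math.log(10.0)


# A773 source models: delta Z = 1.20 gives odds of roughly 3.3 to 1.
print(f"{evidence_ratio(1.20):.1f}")
# MACSJ0744+39 detection: delta Z = 7.88 gives E of a few thousand.
print(f"{evidence_ratio(7.88):.0f}")
# A773 detection: delta Z = 27.13, expressed as a power of ten.
print(f"10^{log10_evidence_ratio(27.13):.1f}")
```

Working with the base-10 exponent avoids printing very large numbers when comparing the strongest detections.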



6 PARAMETER ESTIMATES AND 
DISCUSSION 

The full posterior probability distributions for the seven
clusters are shown in Figures 4-10. In each figure, the upper
panel shows the posterior distributions for the fitted
parameters, marginalized into two dimensions, and into one
dimension along the diagonal; the lower panel shows the
one-dimensional marginalized posterior distributions for
parameters derived from those that were fitted. In Table 7 we
give mean a posteriori parameter estimates for the clusters,
but we caution against their use independently of the
posterior probability distributions.

There are two technical points of which to be aware.
First, some of the distributions have rough sections. This
roughness is just the noise due to the finite number of samples.
We have used narrow binning of parameter values to
avoid misleading effects of averaging, especially at distribution
edges, with the consequence of high noise per bin.
Second, there is a possibility that, for some combinations of
cluster parameters, nowhere in the cluster does the mean density
reach $\alpha \times \rho_{\rm crit}$, resulting in no physical
solution for $r_\alpha$. We set $r_\alpha$ to zero in such cases.
Out of the seven clusters analysed in this paper, this affected
only MACSJ0744+39, resulting in a sharp peak in the posterior
probability of $r_{1000}/h^{-1}$ Mpc and $r_{500}/h^{-1}$ Mpc
close to zero radius. Consequently the posterior probability also
peaks close to zero for the derived parameters $f_{1000}/h^{-1}$,
$f_{500}/h^{-1}$, $M_{1000}/h^{-1}\,M_\odot$ and
$M_{500}/h^{-1}\,M_\odot$ for this cluster. A different SA
configuration or more integration would help for MACSJ0744+39,
but at mean overdensity 200 there is no issue.
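The second point can be made concrete with a minimal sketch. It assumes the standard hydrostatic mass of an isothermal $\beta$-model, $M_r = 3\beta k T r^3 / [G \mu m_{\rm p} (r_c^2 + r^2)]$, which we take to be the form of equation (5) although that equation is not reproduced in this section. Setting the mean enclosed density equal to $\alpha \rho_{\rm crit}$ gives $r_\alpha^2 = 9\beta k T / (4\pi G \mu m_{\rm p} \alpha \rho_{\rm crit}) - r_c^2$, which has no physical solution when the right-hand side is not positive. The cosmology, the mean molecular weight and all function names and example inputs below are assumptions made for illustration only.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
M_P = 1.6726e-27   # proton mass, kg
KEV = 1.602e-16    # 1 keV in joules
MPC = 3.0857e22    # 1 Mpc in metres
MU = 0.6           # assumed mean molecular weight (illustrative)


def rho_crit(z, h=0.7, omega_m=0.3, omega_l=0.7):
    """Critical density in kg m^-3 at redshift z (assumed flat LCDM)."""
    h0 = 100.0 * h * 1.0e3 / MPC  # Hubble constant in s^-1
    hz2 = h0 ** 2 * (omega_m * (1.0 + z) ** 3 + omega_l)
    return 3.0 * hz2 / (8.0 * math.pi * G)


def r_alpha_mpc(alpha, t_kev, beta, r_c_mpc, z):
    """Radius (Mpc) enclosing mean overdensity alpha, or 0.0 if none exists."""
    a = 9.0 * beta * t_kev * KEV / (
        4.0 * math.pi * G * MU * M_P * alpha * rho_crit(z)
    )
    r2 = a - (r_c_mpc * MPC) ** 2
    if r2 <= 0.0:
        # The mean density never reaches alpha * rho_crit: follow the
        # convention in the text and return zero radius.
        return 0.0
    return math.sqrt(r2) / MPC


# Hypothetical parameter combination with a large core radius.
for alpha in (1000, 500, 200):
    print(alpha, r_alpha_mpc(alpha, t_kev=8.0, beta=1.0, r_c_mpc=1.5, z=0.2))
```

With these hypothetical inputs the large core radius leaves no solution at overdensity 1000 while a solution still exists at overdensity 200, which mirrors the behaviour described for MACSJ0744+39.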

To set these results in context, we give examples from
the literature of other estimates of some of these quantities
that we can find for these clusters.

For A611, Schmidt & Allen (2007) using Chandra find
a total virial mass of $6.2^{+?}_{-?} \times 10^{14}\,M_\odot$. From
gravitational lensing, Romano et al. (2010) find $r_{200}$ is some
1.5 Mpc and total mass is some $4$-$7 \times 10^{14}\,M_\odot$.

For A773, Zhang et al. (2008) find from XMM-Newton
that $r_{500}$ is 1.3 Mpc, $M_{500}$ is $(8.3 \pm 2.5) \times 10^{14}\,M_\odot$
and $f_{g,500}$ is $0.13 \pm 0.07$, while Barrena et al. (2007) estimate a virial

Table 6. For each cluster, the $\log_e$-evidence $\Delta Z$ for an SZ signal
in addition to (thermal noise plus CMB primary anisotropies plus
the $n$ sources).

Cluster         n    $\Delta Z$
A611            2    27.27 ± 0.12
A773            7    27.13 ± 0.09
A1914           6    64.84 ± 0.11
A2218           7    92.26 ± 0.23
MACSJ0308+26    3    47.59 ± 0.13
MACSJ0717+37    6    33.90 ± 0.19
MACSJ0744+39    4    7.88 ± 0.16

mass of $1.2$-$2.7 \times 10^{15}\,M_\odot$ from Chandra and optical-spectral
velocities.

For A1914, Zhang et al. (2008) find from XMM-Newton
that $r_{500}$ is 1.7 Mpc, $M_{500}$ is $(16.8 \pm 4.9) \times 10^{14}\,M_\odot$
and $f_{g,500}$ is $0.07 \pm 0.04$. Mroczkowski et al. (2009) fit jointly to
Chandra and SZA data and find $r_{200}$ is 1.3 Mpc, $M_{500}$ is
$6.6$-$8.1 \times 10^{14}\,M_\odot$, and $f_{g,500}$ is 0.14-0.16, the exact
values depending on assumptions, with random errors in addition.
Zhang et al. (2010) find from XMM-Newton that $M_{1000}$ is
$(4.36 \pm 1.22) \times 10^{14}\,M_\odot$ and $M_{500}$ is
$(7.69 \pm 2.24) \times 10^{14}\,M_\odot$, while from weak lensing they find
that $M_{1000}$ is $3.35^{+?}_{-?} \times 10^{14}\,M_\odot$
and $M_{500}$ is $4.46^{+?}_{-?} \times 10^{14}\,M_\odot$.



For A2218, Zhang et al. (2008) find from XMM-Newton
that $r_{500}$ is 1.1 Mpc, $M_{500}$ is $(4.2 \pm 1.3) \times 10^{14}\,M_\odot$
and $f_{g,500}$ is $0.15 \pm 0.09$.

For MACSJ0744+39, Ettori et al. (2009) find from
Chandra that $r_{200}$ is $1566 \pm 56$ kpc, and also from Chandra
Schmidt & Allen (2007) find a virial mass of
$7.4^{+?}_{-?} \times 10^{14}\,M_\odot$.

Returning to our results, three points that are immediately
apparent are that: the gas fractions are low and get
lower as $r$ increases; as well as the usual $\beta$-$r_c$
degeneracy (Grego et al. 2001; Grainge et al. 2002; Saunders et al.
2003), there is a tendency to high $\beta$; and the results go out
to larger radius than typically obtained from X-ray or SZ
cluster analyses. We next consider these points in more detail.

6.1 Masses and gas fractions 

Rather than rising towards a canonical large-scale gas
fraction of, say, 0.15 as one goes to large $r$ (see
e.g. McCarthy et al. 2007, Komatsu et al. 2009, Ettori et al.
2009), our $f_g$ values are low and get smaller as $r$ increases.
We suspect that our assumption of isothermality may be the
cause. If, away from the central region, $T_e(r)$ keeps falling
as $r$ increases, then of course our isothermal assumption is
invalid. The consequences of this for estimating $M$ and $f_g$
are however somewhat worse than we initially expected, for
the following reason. In the literature, it is assumed that the
value for $M_r$ based on hydrostatic equilibrium (equation (5)
in this work) implies $M_r \propto T_e$. But one has to use equation
(5) in terms of the radius $r_\alpha$ internal to which there is a specific
mean overdensity $\alpha$. At a particular $r_\alpha$, one can equate $M_r$
from equation (5) with the expression for $M_r$ from integrating
over spherical shells, finding that $r_\alpha \propto T^{1/2}$ and in fact
$M_r \propto T^{3/2}$ (please note our stated convention at the end
of Section 1). Since $M_{{\rm gas},r} \propto T^{-1}$ (given the SZ
measurement), $f_{{\rm gas},r}$ is proportional to $T^{-5/2}$ rather than
the $T^{-2}$ in the literature. It is not possible here to make an
approximate quantitative estimate of the effects of the isothermal
assumption because of its separate effects on $r_c$, on $\beta$, and
on total and gas masses as functions of $r$. Nevertheless, if
temperatures are less than we have assumed, our total mass
estimates are biased high, our gas fraction estimates are biased
low, and our $r_\alpha$ estimates are somewhat biased high.
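The scaling argument above can be sketched explicitly. The derivation below assumes the standard hydrostatic mass of an isothermal $\beta$-model (taken here to be the form of equation (5), which is not reproduced in this section), writes $\mu$ for the mean mass per gas particle, and works in the limit $r \gg r_c$ for simplicity; the $M_{{\rm gas},r} \propto T^{-1}$ step is taken directly from the text.

```latex
\begin{align}
M_r &= \frac{3\beta k T}{G\mu}\,\frac{r^3}{r_c^2 + r^2}
      \;\longrightarrow\; \frac{3\beta k T}{G\mu}\,r
      \quad (r \gg r_c)
      && \text{hydrostatic equilibrium} \\
M_{r_\alpha} &= \frac{4\pi}{3}\,\alpha\,\rho_{\rm crit}\,r_\alpha^3
      && \text{definition of } r_\alpha \\
\Rightarrow\; r_\alpha^2 &= \frac{9\beta k T}{4\pi G\mu\,\alpha\,\rho_{\rm crit}}
      \;\propto\; T
      && \Rightarrow\; r_\alpha \propto T^{1/2} \\
M_{r_\alpha} &\propto r_\alpha^3 \;\propto\; T^{3/2} \\
f_{{\rm gas},r_\alpha} &= \frac{M_{{\rm gas},r_\alpha}}{M_{r_\alpha}}
      \;\propto\; \frac{T^{-1}}{T^{3/2}} \;=\; T^{-5/2}
\end{align}
```

Without the $r \gg r_c$ simplification the third line reads $r_\alpha^2 + r_c^2 \propto T$, so the pure power laws hold only approximately when the core radius is not negligible.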

6.2 Reaching high radius 

Lacey & Cole (1993) give an expression for how the classical
virial radius ($r_{178}$ at $z = 0$) changes with $z$ in an
$\Omega$-$\Lambda$ universe: for our lowest and highest cluster
redshifts, the virial radii are approximately $r_{205}$ and $r_{215}$.
The SA's sensitivity to structures out to diameters of 10' corresponds
to sensitivity to a physical diameter of 1.7 Mpc at our lowest cluster
redshift. Given that our $r_{200}$ estimate is biased high, our
plots at overdensity 200 thus reach the virial radius in our
nearer clusters with some extrapolation of the SZ signal and
with no extrapolation in the more-distant ones.
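The conversion from the 10' angular scale to a physical diameter can be checked with a short calculation. The sketch below assumes a flat $\Lambda$CDM cosmology with $h = 0.7$, $\Omega_{\rm m} = 0.3$ and $\Omega_\Lambda = 0.7$; these values, and the function names, are assumptions made here for illustration rather than quantities taken from this section. The lowest cluster redshift, 0.171, is the one given in the abstract.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def angular_diameter_distance_mpc(z, h=0.7, omega_m=0.3, omega_l=0.7, n=1000):
    """Angular-diameter distance in Mpc for an assumed flat LCDM cosmology.

    The comoving distance integral is evaluated with Simpson's rule
    using n (even) sub-intervals.
    """
    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_l)

    step = z / n
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * inv_e(i * step)
    comoving = (C_KM_S / (100.0 * h)) * total * step / 3.0
    return comoving / (1.0 + z)


def physical_size_mpc(theta_arcmin, z, **cosmo):
    """Physical size in Mpc subtended by theta_arcmin at redshift z."""
    theta_rad = math.radians(theta_arcmin / 60.0)
    return theta_rad * angular_diameter_distance_mpc(z, **cosmo)


# 10 arcmin at the lowest cluster redshift quoted in the abstract.
print(physical_size_mpc(10.0, 0.171))
```

Under these assumed cosmological parameters the result is close to the 1.7 Mpc quoted above; a different choice of $h$ rescales it as $h^{-1}$.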



6.3 $\beta$

Typical low-$r$ $\beta$-values are about 0.7 (see
e.g. Jones & Forman 1984; Mohr et al. 1999; Ettori et al.
2004) and reach about 0.9 by $r_{1000}$ (see e.g. Vikhlinin et al.
1999; Hallman et al. 2007). Despite the $\beta$-$r_c$ degeneracy,
when we marginalize over everything but $\beta$ we find that $\beta$
is much larger. The two likely reasons for this are that our
data go to high $r$ and that our estimates of $M_{\rm gas}$ are biased
low at high $r$ because the $T_e$ we use there is too high; at
present we cannot assess the relative contributions of these
two factors.



7 CONCLUSIONS 

(i) Untapered, naturally-weighted AMI Small Array 
maps at 13.9-18.2 GHz, with no source subtraction, show 
clear SZ effects in five of the seven clusters. 

(ii) Using source-subtraction observations that are largely 
from the Ryle Telescope (and thus at 15 GHz but typi- 
cally two years before the SA observations), and assuming 
a spherical /3-model, hydrostatic equilibrium, and isother- 
mality with an X-ray measured temperature, our Bayesian 
analysis reveals SZ signals in all seven clusters. In six of 
these, the Bayesian evidence for an SZ detection, in addi- 
tion to sources plus CMB primary anisotropies plus thermal 
noise, is huge; in the one of them with much the worst ther- 
mal noise, there is a 1 in 3000 chance that the SZ is spurious. 
We emphasize that, to allow for variability, we set the prior
on each source's flux density as its high-resolution value with
a Gaussian 1-$\sigma$ width of (except in one case) ±30 per cent.

(iii) The Bayesian evidence proves very useful in un- 
derstanding source environments. For example, a high- 
resolution map showed a feature that, by eye, was classed 
as a tentative radio-source detection. Running the Bayesian 
analysis twice, with and without that tentative source, 
showed that the evidence for it is in fact so low that it should 
not be included. 

(iv) We note that our sensitivity to structures out to 10', 
corresponding to a 1.7-Mpc diameter for our lowest-redshift 
cluster, means that our parameter estimates out to the clas- 
sical virial radii of the nearer clusters involve some extrap- 
olation, but no extrapolation is needed for the more-distant 
ones. 

(v) Our probability distributions of masses and radii
internal to which the average overdensities are 1000, 500 and
200 are usefully constrained and change sensibly over this
range. However, our gas fractions are evidently low compared
with values in the literature; further, they decrease
with increasing radius, which is also unexpected. The problem
seems consistent with the notion that temperature $T_e$
decreases as radius $r$ increases whereas we are assuming
isothermality (using temperatures measured from low-radii
Figure 4. A611 posterior probability distribution.

Figure 5. A773 posterior probability distribution. (a) For fitted parameters, posteriors marginalized into two dimensions, and into one dimension along the diagonal. (b) For derived parameters, posteriors marginalized into one dimension. [Plot panels not reproduced; recoverable axis labels include x/arcsec, y/arcsec and T/keV.]

Figure 6. A1914 posterior probability distribution. (b) For derived parameters, posteriors marginalized into one dimension.

Figure 7. A2218 posterior probability distribution. (b) For derived parameters, posteriors marginalized into one dimension.

Figure 8. MACSJ0308+26 posterior probability distribution.

Figure 9. MACSJ0717+37 posterior probability distribution.

Figure 10. MACSJ0744+39 posterior probability distribution.

Table 7. Mean a posteriori parameter estimates with 68 per cent confidence limits. Note that e$x$ means $10^{x}$ in this table.

data); the problem is made somewhat worse because, as we
have shown, gas fraction goes as $T_e^{-2.5}$ assuming isothermality
and hydrostatic equilibrium rather than as $T^{-2}$ as seems
to have been assumed in the literature. If $T_e$ does indeed
fall as $r$ increases, our gas masses are biased low and our
total masses (and to a lesser extent our measurements of $r_\alpha$)
are biased high. Temperature profiles must be measured or
some other means found to deal with this problem if we are
to infer masses out towards the virial radius. Indeed, along
with other density-profile models, this will be investigated
in future work.